[model] Add UltraFlux-v1-image support #611
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7af3b2287a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
"pipeline_wan2_2",
"Wan22Pipeline",
),
Preserve existing diffusion model registrations
This registry block now omits several existing pipelines (e.g., WanImageToVideoPipeline, BagelPipeline, LongCatImageEditPipeline, StableDiffusion3Pipeline) even though their modules still exist under vllm_omni/diffusion/models. Since initialize_model looks up od_config.model_class_name in _DIFFUSION_MODELS and raises a ValueError when missing, any config that previously used those model_class_name values will now fail to initialize. Please restore those entries or explicitly deprecate them across docs/configs.
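A minimal sketch of the lookup path the comment describes (the module tuples here are illustrative assumptions, not the real registry contents):

```python
# Sketch of the registry lookup described above; entries other than those
# quoted in the diff are assumptions for illustration only.
_DIFFUSION_MODELS = {
    "Wan22Pipeline": ("wan2_2", "pipeline_wan2_2", "Wan22Pipeline"),
    # Previously-registered entries that the comment asks to restore:
    "WanImageToVideoPipeline": ("wan2_2", "pipeline_wan2_2", "WanImageToVideoPipeline"),
    "StableDiffusion3Pipeline": ("sd3", "pipeline_sd3", "StableDiffusion3Pipeline"),
}

def try_load_model_cls(model_class_name: str):
    # Mirrors the behavior described for initialize_model: a missing key
    # raises ValueError, so dropped registrations break old configs.
    if model_class_name not in _DIFFUSION_MODELS:
        raise ValueError(f"Model class {model_class_name} not found.")
    return _DIFFUSION_MODELS[model_class_name]
```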
#UltraFlux
"FluxPipeline": (
"ultraflux-v1_image",
"pipeline_ultraflux",
"UltraFluxPipeline",
Register UltraFlux under UltraFluxPipeline key
The new UltraFlux entry is keyed as FluxPipeline, but the actual class and docs use UltraFluxPipeline. If users set model_class_name=UltraFluxPipeline as documented, _DIFFUSION_MODELS lookup will fail and initialize_model will raise “Model class UltraFluxPipeline not found.” Please register it under UltraFluxPipeline (or update naming consistently) so the model is loadable.
{
"WanPipeline": enable_cache_for_wan22,
"FluxPipeline": enable_cache_for_flux,
"UltraFluxPipeline": enable_cache_for_flux,
Maybe you need to use a new function name, such as enable_cache_for_ultraflux.
"pipeline_wan2_2",
"Wan22Pipeline",
),
"WanImageToVideoPipeline": (
Do not delete the existing entries; just add your pipeline.
"StableDiffusion3Pipeline",
),
#UltraFlux
"FluxPipeline": (
Rename this key to UltraFluxPipeline.
od_config: OmniDiffusionConfig,
):
model_class = DiffusionModelRegistry._try_load_model_cls(od_config.model_class_name)
print("DEBUG model_class =", model_class)
"LongCatImagePipeline",
),
"BagelPipeline": (
# where mod_folder and mod_relname are defined and mapped using `_DIFFUSION_MODELS` via the `arch` key
"QwenImageEditPipeline": "get_qwen_image_edit_pre_process_func",
"QwenImageEditPlusPipeline": "get_qwen_image_edit_plus_pre_process_func",
"QwenImageLayeredPipeline": "get_qwen_image_layered_pre_process_func",
@@ -4,7 +4,7 @@
import multiprocessing as mp
Adding a model should not require changes to diffusion_engine.
@@ -31,6 +31,7 @@ th {
|`LongcatImagePipeline` | LongCat-Image | `meituan-longcat/LongCat-Image` |
- "vllm_omni.entrypoints.async_diffusion" # avoid importing vllm in mkdocs building
- "vllm_omni.entrypoints.openai" # avoid importing vllm in mkdocs building
- "vllm_omni.entrypoints.openai.protocol" # avoid importing vllm in mkdocs building
- "vllm_omni.entrypoints.omni" # avoid importing vllm in mkdocs building
Why do we need to change this?
# Summarize and print stats
try:
import json as _json
Why do we need to change this? @Bounty-hunter PTAL.
@@ -4,6 +4,7 @@
from dataclasses import dataclass
This PR is for adding a model; you should not make any changes to these files.
def _initialize_stages(self, model: str, kwargs: dict[str, Any]) -> None:
"""Initialize stage list management."""
stage_init_timeout = kwargs.get("stage_init_timeout", 20)
# Diffusion/large models can take long to load; align default with CLI (300s)
You should not change this default value here; provide it via your CLI instead.
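As a sketch of the suggested approach, keep the library default and let users opt into a longer timeout on the command line (the flag name here is a hypothetical illustration, not the project's actual CLI):

```python
# Hypothetical CLI sketch: leave the in-code default at 20s and let
# callers pass a longer stage-init timeout explicitly.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--stage-init-timeout",
    type=int,
    default=20,  # library default stays unchanged
    help="Seconds to wait for stage initialization",
)

# A user loading a large diffusion model opts in to 300s:
args = parser.parse_args(["--stage-init-timeout", "300"])
```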
pass
logger.debug("Engine initialized")
# Check if stage engine supports profiling (via vLLM's built-in profiler)
Why do you need to change omni_stage in a model-support PR?
}
)
return result
This should not be changed in this PR.
Force-pushed 560c1a1 to 448588b
"StableDiffusion3Pipeline",
),
#UltraFlux
"FluxPipeline": (
@@ -0,0 +1,931 @@
# Copyright 2025 Black Forest Labs, The HuggingFace Team and The InstantX Team. All rights reserved.
@@ -33,7 +33,7 @@ th {
|`LongCatImageEditPipeline` | LongCat-Image-Edit | `meituan-longcat/LongCat-Image-Edit` |
Please also update the diffusion acceleration doc for cache-dit support.
@david6666666 Can we avoid using benchmark serving under the benchmarks folder for t2i jobs?
Please make similar modifications based on the review comments in #809.
Force-pushed 8fc9d02 to 67a48fb
|`Qwen3TTSForConditionalGeneration` | Qwen3-TTS-12Hz-1.7B-VoiceDesign | `Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign` |
|`Qwen3TTSForConditionalGeneration` | Qwen3-TTS-12Hz-1.7B-Base | `Qwen/Qwen3-TTS-12Hz-0.6B-Base` |
|`UltraFluxPipeline` | UltraFlux-v1 | `Owen777/UltraFlux-v1` |
Looks like this diff might have accidentally removed the three Qwen3-TTS entries — probably a rebase artifact? The UltraFlux line should be added alongside them.
self.default_sample_size = 64
print(self.vae.config)
print(self.transformer.config)
A few debug print() calls here — might want to swap them for logger.debug() or remove before merging.
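A minimal sketch of the suggested swap to logging (the config values below are placeholders, not the real pipeline attributes):

```python
# Sketch: replace bare print() debugging with the standard logging module.
import logging

logger = logging.getLogger(__name__)

vae_config = {"scale_factor": 32}        # placeholder for self.vae.config
transformer_config = {"num_layers": 19}  # placeholder for self.transformer.config

# Lazy %-style formatting: the string is only built if DEBUG is enabled.
logger.debug("vae config: %s", vae_config)
logger.debug("transformer config: %s", transformer_config)
```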
scheduler=scheduler,
)
self.vae_scale_factor = 32
Quick question — vae_scale_factor = 32 while standard Flux uses 16. Is 32 correct for UltraFlux, or a copy-paste from somewhere? If it's intentional, a brief comment explaining why would be helpful.
def _get_fused_projections(attn: "FluxAttention", hidden_states, encoder_hidden_states=None):
query, key, value = attn.to_qkv(hidden_states).chunk(3, dim=-1)
encoder_query = encoder_key = encoder_value = (None,)
I think there might be a small issue here — (None,) creates a 1-tuple rather than assigning None to all three variables. This could cause problems downstream when calling .unflatten(...) on a tuple. Maybe:
encoder_query = encoder_key = encoder_value = None
mscale = torch.where(
scale <= 1.0, torch.tensor(1.0, device=scale.device, dtype=scale.dtype), 0.1 * torch.log(scale) + 1.0
)
mscale = torch.where(
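A quick standalone demonstration of the 1-tuple pitfall noted in the review:

```python
# Parenthesized (None,) is a one-element tuple, so chained assignment
# binds all three names to a tuple rather than to None.
encoder_query = encoder_key = encoder_value = (None,)
assert isinstance(encoder_query, tuple) and len(encoder_query) == 1

# The suggested fix binds None to each name:
encoder_query = encoder_key = encoder_value = None
assert encoder_query is None and encoder_value is None
```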
Looks like this mscale computation is a duplicate of lines 612-614. Probably a copy-paste leftover — the second one could be removed.
self.added_kv_proj_dim = added_kv_proj_dim
self.added_proj_bias = added_proj_bias
self.norm_q = torch.nn.RMSNorm(dim_head, eps=eps, elementwise_affine=elementwise_affine)
I saw in the earlier review that the maintainer suggested using from vllm.model_executor.layers.layernorm import RMSNorm instead of torch.nn.RMSNorm. Looks like it still needs to be updated here and on lines 300, 311, 312.
for tok_name in ("tokenizer", "tokenizer_2"):
tok = getattr(self.pipe, tok_name, None)
if tok is not None and hasattr(tok, "model_max_length"):
tok.model_max_length = 512
Just wondering — hardcoding tok.model_max_length = 512 overrides whatever the tokenizer originally had. Would it make sense to read this from the model config instead?
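A sketch of the config-driven alternative (the attribute name tokenizer_max_length and the dummy config class are hypothetical placeholders):

```python
# Sketch: read the tokenizer max length from the model config when
# available, falling back to a default instead of always overwriting it.
class DummyConfig:
    tokenizer_max_length = 512  # placeholder stand-in for the model config

def resolve_max_length(config, fallback=512):
    # getattr with a fallback preserves whatever the config specifies.
    return getattr(config, "tokenizer_max_length", fallback)

max_len = resolve_max_length(DummyConfig())
```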
"LongCatImagePipeline": enable_cache_for_longcat_image,
"LongCatImageEditPipeline": enable_cache_for_longcat_image,
"UltraFluxPipeline": enable_cache_for_ultraflux,
"LongcatImagePipeline": enable_cache_for_longcat_image,
The LongCatImagePipeline -> LongcatImagePipeline rename seems like a separate change. Might be worth mentioning in the PR description, or splitting it out if you prefer?
self.tokenizer_max_length = (
self.tokenizer.model_max_length if hasattr(self, "tokenizer") and self.tokenizer is not None else 77
)
self.default_sample_size = 64
With default_sample_size = 64 and vae_scale_factor = 32, the default resolution would be 2048x2048, but the test plan uses 4096x4096. Is 2048 the intended default?
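For reference, the arithmetic behind that observation:

```python
# Default output resolution implied by the quoted values:
# latent sample size times the VAE scale factor.
default_sample_size = 64
vae_scale_factor = 32
default_resolution = default_sample_size * vae_scale_factor  # 2048
```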
@erfgss Hello, any updates?
Yes, I will update this PR.
Signed-off-by: Chen Yang <[email protected]>
Force-pushed 5e1bb37 to 36921b5
Force-pushed dff2731 to 33ef7bd
Purpose
Add UltraFlux-v1-image support #327
Test Plan
Test Result
without cache_dit
with cache_dit
Essential Elements of an Effective PR Description Checklist
Update supported_models.md and examples for a new model.